Abstract

A DNA triplex is formed when pyrimidine or purine bases occupy the major
groove of the DNA double helix, forming Hoogsteen pairs with purines of the
Watson-Crick basepairs. Intermolecular triplexes are formed between
triplex-forming oligonucleotides (TFOs) and target sequences on duplex DNA.
Intramolecular triplexes are the major elements of H-DNAs, unusual DNA
structures that are formed in homopurine-homopyrimidine regions of
supercoiled DNAs. TFOs are promising gene-drugs that can be used in an
anti-gene strategy, which attempts to modulate gene activity in vivo.
Numerous chemical modifications of TFOs are known. In peptide nucleic acid
(PNA), the sugar-phosphate backbone is replaced with a protein-like
backbone. PNAs form P-loops while interacting with duplex DNA, forming a
triplex with one of the DNA strands and leaving the other strand displaced.
Very unusual recombination or parallel triplexes, or R-DNA, have been
assumed to form under RecA protein in the course of homologous
recombination.

PERSPECTIVES AND SUMMARY 

Since the pioneering work of Felsenfeld, Davies, & Rich (1), double-stranded 
polynucleotides containing purines in one strand and pyrimidines in the other 
strand [such as poly(A)/poly(U), poly(dA)/poly(dT), or poly(dAG)/poly(dCT)]
have been known to be able to undergo a stoichiometric transition forming a 
triple-stranded structure containing one polypurine and two polypyrimidine 
strands (2-4). Early on, it was assumed that the third strand was located in the 
major groove and associated with the duplex via non-Watson-Crick interactions 
now known as Hoogsteen pairing. Triple helices consisting of one pyrimidine and 
two purine strands were also proposed (5, 6). However, notwithstanding the fact
that single-base triads in tRNA structures were well-documented (reviewed in 7), 
triple-helical DNA escaped wide attention before the mid-1980s. 

The considerable modern interest in DNA triplexes arose due to two partially 
independent developments. First, homopurine-homopyrimidine stretches in
supercoiled plasmids were found to adopt an unusual DNA structure, called 
H-DNA, which includes a triplex as the major structural element (8, 9). Sec- 
ondly, several groups demonstrated that homopyrimidine and some purine-rich 
oligonucleotides can form stable and sequence-specific complexes with
corresponding homopurine-homopyrimidine sites on duplex DNA (10-12). These
complexes were shown to be triplex structures rather than D-loops, where the 
oligonucleotide invades the double helix and displaces one strand. A charac- 
teristic feature of all these triplexes is that the two chemically homologous 
strands (both pyrimidine or both purine) are antiparallel. These findings led to 
explosive growth in triplex studies. 

During the study of intermolecular triplexes, it became clear that triplex-form- 
ing oligonucleotides (TFOs) might be universal drugs that exhibit sequence-spe- 
cific recognition of duplex DNA. This is an exciting possibility because, in 
contrast to other DNA-binding drugs, the recognition principle of TFOs is very 
simple: Hoogsteen pairing rules between a purine strand of the DNA duplex and 
the TFO bases. However, this mode of recognition is limited in that
homopurine-homopyrimidine sites are preferentially recognized. Though
significant efforts
have been directed toward overcoming this limitation, the problem is still 
unsolved in general. Nevertheless, the high specificity of TFO-DNA recognition 
has led to the development of an "antigene" strategy, the goal of which is to
modulate gene activity in vivo using TFOs (reviewed in 13). 

Although numerous obstacles must be overcome to reach the goal, none are 
likely to be fatal for the strategy. Even if DNA TFOs proved to be unsuitable
as gene-drugs, there are already many synthetic analogs that also exhibit
triplex-type recognition. Among them are oligonucleotides with non-natural 
bases capable of binding the duplex more strongly than can natural TFOs. 
Another promising modification replaces the sugar-phosphate backbone of 
ordinary TFOs with an uncharged peptide-like backbone, called a peptide nucleic
acid (PNA) (reviewed in 14). Homopyrimidine PNAs form remarkably strong 
and sequence-specific complexes with the DNA duplex via an unusual strand- 
displacement reaction: Two PNA molecules form a triplex with one of the 
DNA strands, leaving the other DNA strand displaced (a "P-loop") (15, 16). 

The ease and sequence specificity with which duplex DNA and TFOs formed 
triplexes seemed to support the idea (17) that the homology search preceding 
homologous recombination might occur via a triplex between a single DNA 
strand and the DNA duplex without recourse to strand separation in the duplex. 
However, these proposed recombination triplexes are dramatically different 
from the orthodox triplexes observed experimentally. First, the recombination 
triplexes must be formed for arbitrary sequences and, second, the two identical 
strands in this triplex are parallel rather than antiparallel. Some data supported
the existence of a special class of recombination triplexes, at least within the 
complex among duplex DNA, RecA protein, and single-stranded DNA (re- 
viewed in Ref. 18), called R-DNA. A stereochemical model of R-DNA was 
published (19). However, the structure of the recombination intermediate is far 
from being understood, and some recent data strongly favor the traditional 
model of homology search via local strand separation of the duplex and D-loop 
formation mediated by RecA protein. 

Intramolecular triplexes (H-DNA) are formed in vitro under superhelical 
stress in homopurine-homopyrimidine mirror repeats. The average negative
supercoiling in the cell is not sufficient to induce H-DNA formation in most
cases. However, H-DNA can be detected in vivo in association with an
increase of DNA
supercoiling driven by transcription or other factors (reviewed in 20). H-DNA 
may even be formed without DNA supercoiling during in vitro DNA synthesis. 
Peculiarly, this DNA polymerase-driven formation of H-DNA efficiently pre- 
vents further DNA synthesis (21, 22). There are preliminary indications that 
H-DNA may also terminate DNA replication in vivo (23). More work is required, 
however, to elucidate the role of H-DNA in biological systems. 

STRUCTURE, STABILITY, AND SPECIFICITY OF DNA 
TRIPLEXES 

Triplex Menagerie 

Since the original discovery of oligoribonucleotide-formed triplexes, numerous
studies have shown that the structure of triplexes may vary substantially. First,
it was shown that triplexes may consist of two pyrimidine and one purine strand
(YR*Y) or of two purine and one pyrimidine strand (YR*R). Second, triplexes
can be built from RNA or DNA chains or their combinations. Third, triplexes
can be formed within a single polymer molecule (intramolecular triplexes) or
by different polynucleotides (intermolecular triplexes). Finally, for special
DNA sequences consisting of clustered purines and pyrimidines in the same
strand, triplex formation may occur by a strand-switch mechanism (alternate
strand triplexes). Figure 1 summarizes numerous possible structures of
triple-helical nucleic acids.

Figure 1 Triplex menagerie (see text for explanations). Panels are grouped as
YR*Y, YR*R, and alternate-strand triplexes. Solid lines, purine strands;
stippled lines, pyrimidine strands; vertical lines, Watson-Crick hydrogen
bonds; diamonds, Hoogsteen hydrogen bonds. Arrows indicate DNA chain polarity.

The building blocks of YR*Y triplexes are the canonical CG*C and TA*T
triads shown in Figure 2. To form such triads, the third strand must be located
in the major groove of the double helix, forming Hoogsteen hydrogen
bonds (24) with the purine strand of the duplex. The remarkable isomorphism
of both canonical triads makes it possible to form a regular triple helix. This
limits YR*Y triplexes to homopurine-homopyrimidine sequences in DNA. An
important feature of the YR*Y triplexes is that formation of the CG*C triad 
requires the protonation of the N3 of cytosine in the third strand. Thus, such 
triplexes are favorable under acidic conditions (3, 4). By contrast, the YR*R 
triplexes usually do not require protonation (see below). 

Figure 2 Canonical base triads of YR*Y triplexes: TA*T and CG*C+.

The mutual orientation of the chemically homologous strands in a triplex
(i.e. the two pyrimidine strands in the YR*Y triplex or the two purine strands
in the YR*R triplex), which a priori can be either parallel or antiparallel, is
of paramount importance. The discovery of H- and *H-DNA (see below) indicated
that both YR*Y and YR*R triplexes form as antiparallel structures (9, 25).
A thorough investigation of intermolecular triplexes by different methods
unambiguously demonstrated that both YR*Y and YR*R triplexes are stably
formed only as antiparallel structures. The most direct data were obtained by
cleaving target DNA with homopyrimidine or homopurine oligonucleotides
attached to Fe·EDTA (11, 26). The observed pairing and orientation rules
rigorously determine the sequence of the triplex-forming homopyrimidine
strand.
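The pairing and orientation rules above can be illustrated with a minimal Python sketch. It assumes that the target is supplied as the purine strand of the duplex written 5' to 3' and that only the canonical TA*T and CG*C+ triads are used; the function names are hypothetical and serve only this illustration.

```python
# Hoogsteen pairing rules for YR*Y triplexes as stated in the text:
# T in the third strand recognizes A, and (protonated) C recognizes G.
HOOGSTEEN_YRY = {"A": "T", "G": "C"}


def design_pyrimidine_tfo(purine_strand: str) -> str:
    """Return the homopyrimidine third strand (5'->3') for a homopurine target.

    `purine_strand` is the purine strand of the duplex written 5'->3'.
    The third strand is antiparallel to the pyrimidine strand of the duplex,
    i.e. parallel to the purine strand, so no reversal is needed.
    """
    target = purine_strand.upper()
    if not target or any(base not in HOOGSTEEN_YRY for base in target):
        raise ValueError("target must be a non-empty homopurine sequence (A/G only)")
    return "".join(HOOGSTEEN_YRY[base] for base in target)


def watson_crick_complement(strand: str) -> str:
    """Reverse complement (5'->3'), i.e. the pyrimidine strand of the duplex."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(strand.upper()))


target = "AAGGAGA"
tfo = design_pyrimidine_tfo(target)
print(tfo)                              # TTCCTCT
print(watson_crick_complement(target))  # TCTCCTT
```

For this target the third strand is the pyrimidine strand of the duplex read backwards, which is what an antiparallel arrangement of the two chemically homologous strands implies.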

YR*R triplexes are more versatile than YR*Y triplexes. Originally it was 
believed that they must be built from CG*G and TA*A triads (6, 25, 27). Later 
work showed, however, that the TA*T triad may also be incorporated into the
otherwise YR*R triplex. Moreover, the stability of triplexes consisting of
alternating CG*G and TA*T triads is higher than that of triplexes built of
CG*G and TA*A triads (26). Thus, the term YR*R triplex, though routinely
used in the literature, is misleading with regard to the chemical nature of the
third strand. The corresponding triads are presented in Figure 3. One can see
that they are not strictly isomorphous, as was the case for YR*Y triads. Another
notable difference between the two triplex types is that reverse Hoogsteen
basepairs are needed to form reasonable stacking interactions among CG*G,
TA*A, and TA*T triads (26).

Whereas in YR*Y triplexes the sequence of the third strand is fully
determined by the sequence of the duplex, the situation is different for YR*R
triplexes. Here the third strand may consist of three bases, G, A, and T, where
the guanines oppose guanines in the duplex, while adenines or thymines must
oppose adenines of the duplex. A protonated CG*A+ triad (Figure 3) forms so
that A in the third strand may also oppose G in the duplex at acidic pH (28).
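In contrast to the YR*Y case, these rules leave a choice at every adenine of the target. The following sketch enumerates the resulting candidates. It assumes the purine strand of the duplex is written 5' to 3', takes the third strand as antiparallel to it (as stated earlier for chemically homologous strands), and ignores the acid-dependent CG*A+ triad; the names and the `limit` parameter are illustrative only.

```python
from itertools import product

# Third-strand options for YR*R triplexes as listed in the text:
# G opposes G; A or T opposes A. The CG*A+ triad (acidic pH) is left out.
YRR_OPTIONS = {"G": ("G",), "A": ("A", "T")}


def purine_type_tfos(purine_strand: str, limit: int = 16) -> list[str]:
    """Enumerate candidate third strands (5'->3') for a YR*R triplex.

    The third strand is antiparallel to the purine strand of the duplex,
    so the target is read from its 3' end to write each candidate 5'->3'.
    At most `limit` candidates are returned.
    """
    target = purine_strand.upper()
    if not target or any(base not in YRR_OPTIONS for base in target):
        raise ValueError("target must be a non-empty homopurine sequence (A/G only)")
    choices = [YRR_OPTIONS[base] for base in reversed(target)]
    candidates = []
    for combo in product(*choices):
        candidates.append("".join(combo))
        if len(candidates) == limit:
            break
    return candidates


print(purine_type_tfos("GGAG"))  # ['GAGG', 'GTGG']
```

The number of candidates doubles with every adenine in the target, which is why the sketch caps the output; which candidate is actually most stable depends on the triad composition discussed above and is not modeled here.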

Another novel feature of YR*R triplexes is that their stability depends 
dramatically on the presence of bivalent metal cations (reviewed in 20). Unlike 
the case of YR*Y triplexes, where the requirement for H+ ions has an obvious
reason, the metal dependence of YR*R triplexes is an obscure function of the 
particular metal ion and the triplex sequence (29). Possible structural reasons 
for selectivity of bivalent cations in stabilization of YR*R triplexes are dis- 
cussed in Ref. 30. 

Despite these differences, the YR*R triplexes are similar to YR*Y triplexes 
in their most fundamental features: (a) The duplex involved in triplex forma- 
tion must have a homopurine sequence in one strand, and (b) the orientation 
of the two chemically homologous strands is antiparallel.

One can easily imagine numerous "geometrical" ways to form a triplex, and
those that have been studied experimentally are shown in Figure 1. The
canonical intermolecular triplex consists of either three independent
oligonucleotide chains (3, 4) (Figure 1A,F) or of a long DNA duplex carrying the
homopurine-homopyrimidine insert and the corresponding oligonucleotide
(10-12). In any case, triplex formation strongly depends on the
oligonucleotide(s) concentration.

A single DNA chain may also fold into a triplex connected by two loops 
(Figure 1B,G). To comply with the sequence and polarity requirements for
triplex formation, such a DNA strand must have a peculiar sequence: It con- 
tains a mirror repeat (homopyrimidine for YR*Y triplexes and homopurine 
for YR*R triplexes) flanked by a sequence complementary to one half of this 
repeat (31). Such DNA sequences fold into a triplex configuration much more
readily than do the corresponding intermolecular triplexes, because all triplex- 
forming segments are brought together within the same molecule (31-33). 

There is also a family of triplexes built from a single strand and a hairpin.
Two types of arrangements are possible for such structures (34, 35): (a) a
canonical hairpin, formed by two self-complementary DNA segments, is
involved in Hoogsteen hydrogen bonding with a single strand (Figure 1C,H),
and (b) a "hairpin" containing a homopurine (for YR*R) or homopyrimidine
(for YR*Y) mirror repeat is involved in both Watson-Crick and Hoogsteen
basepair formation in a triplex (Figure 1D,I). A peculiar modification of this
scheme was described in Refs. 36 and 37, where a short circular
oligonucleotide could be used for triplex formation instead of a hairpin
(Figure 1E,J). Such a triplex-forming oligonucleotide is of particular interest
for DNA targeting in vivo (see below), since circular oligonucleotides are not
substrates for degradation by exonucleases.

The structures in Figure 1 are intentionally ambiguous with regard to the 5' 
and 3' ends of polynucleotide chains. In fact, all these structures may exist as 
two chemically distinct isoforms differing in relative chain polarity. The com- 
parative stability of the two isoforms is poorly known. Very recent data 
presented in Refs. 38 and 39 indicate that their free energies may differ by 
1.5-2.0 kcal/mol and may depend on the loop sequence. The two isoforms 
may also differ topologically (see below). 

So far, we have considered triplexes with their duplex part consisting of 
purely homopurine and homopyrimidine strands (the influence of individual 
mismatched triads is discussed below). It has become clear recently, however, 
that both sequence requirements and chain polarity rules for triplex formation 
can be met by DNA target sequences built of clusters of purines and 
pyrimidines (40-43) (see Figure 1K-M). The third strand consists of adjacent
homopurine and homopyrimidine blocks forming Hoogsteen hydrogen bonds
with purines on alternate strands of the target duplex, and this strand switch
preserves the proper chain polarity. These structures, called alternate-strand
triplexes, have been experimentally observed as both intra- (42) (Figure 1M)
and intermolecular (41, 43) (Figure 1K,L) triplexes. These results increase
the number of potential targets for triplex formation in natural DNAs
somewhat by adding sequences composed of purine and pyrimidine clusters,
although arbitrary sequences are still not targetable because strand switching
is energetically unfavorable. Preliminary estimates give the minimal length
of a cluster in an alternate-strand triplex as between 4 and 8 (44). A peculiar
feature of alternate-strand triplexes is that two different sequences of the
third strand fulfill the requirements for triplex formation for a single duplex
target (Figure 1K,L). For a few studied targets, the efficiency of triplex
formation by the two variants was quite different. Strand switching in the
direction 3'-Rn-Yn-5' along the third strand was more favorable than
3'-Yn-Rn-5' (44).

Hybrid triplexes consisting of both DNA and RNA chains are less studied,
and only for YR*Y triplexes. Eight combinations of RNA and DNA chains
within a triplex are possible in principle, and the relative stability of each was
studied (45-47). The results from different groups differ substantially, for
reasons that are yet to be understood (though they may be attributed to
differences in sequences and/or the ambient conditions). However, all these
studies show consistently that triplexes are more stable when DNA represents 
the central homopurine strand than when RNA does. Affinity cleavage data 
also indicate that the orientation of chemically homologous chains in hybrid 
triplexes is antiparallel. 

Fine Structure of DNA Triplexes 

The structural features of DNA triplexes have been studied using such diverse 
techniques as chemical and enzymatic probing, affinity cleavage, and electro- 
phoresis, all of which have provided insight into the overall structure of 
triplexes: (a) The third strand lies in the major groove of the duplex, as is 
deduced from the guanine N7 methylation protection (48, 49); (b) the orien- 
tation of the third strand is antiparallel to the chemically homologous strand 
of the duplex (11, 26); and (c) the duplex within the triplex is noticeably 
unwound relative to the canonical B-DNA (12). However, the fine structure 
of triplexes could not be elucidated at atomic resolution without more direct 
structural methods based on X-ray diffraction and NMR. 

The first attempt to deduce the atomic structure of a
poly(dT)·poly(dA)·poly(dT) triplex using X-ray fiber diffraction was performed
in 1974 (50). Two important parameters of the triple helix, an axial rise equal
to 3.26 Å and a helical twist of 30°, were directly determined from the fiber
diffraction patterns. However, in an attempt to fit experimental data with
atomic models, the duplex within the triplex was erroneously concluded to adopt
the A conformation (50). Other studies of the atomic structure of triplexes
have only recently corrected this widely accepted conclusion. NMR data (51)
(see below) and infrared spectroscopy (52) convincingly demonstrated that the
sugar pucker in all three strands within the triplex is of the S-type (a
characteristic for B-DNA rather than A-DNA). It appeared that a B-like
structure could nicely explain the original fiber diffraction data (Table 1).
Moreover, this structure is stereochemically more favorable than is the
original (53).

Table 1 Computed averages for various helical parameters

                             YR*Y triplexes       YR*R triplexes
Parameter                    X-ray     NMR        X-ray     NMR        B-DNA     A-DNA
Axial rise (Å)               3.3       3.4        n/a       3.6        3.4       2.6
Helical twist (°)            31        31         n/a       30         36        33
Axial displacement (Å)       -2.5      -1.9       n/a       -1.9       -0.7      -5.3
Glycosidic torsional angle   anti      anti       n/a       anti       anti      anti
Sugar pucker                 C2' endo  C2' endo   n/a       C2' endo   C2' endo  C3' endo

Further sophisticated NMR studies have examined inter- and intramolecular 
triplexes of both YR*Y and YR*R types. These data unambiguously supported
the major features of the YR*Y triplexes discussed above: (a) a requirement 
for cytosine protonation (54-56); (b) Hoogsteen basepairing of the third strand 
(32, 57); and (c) antiparallel orientation of the two pyrimidine strands (32, 57). 
The atomic structure of the triple helix, summarized in Table 1, was also 
determined (51, 58). The values of all major parameters determined by NMR 
are very close to those determined by fiber diffraction (53). Most significantly, 
the deoxyribose conformation of all strands in the triplex corresponds to an 
S-type (C2'-endo) pucker (51). It is clear from Table 1 that the duplex within 
the triplex adopts a B-like configuration; the helical twist in the triplex, how- 
ever, is significantly smaller than that for B-DNA. 
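The geometric consequence of the smaller twist can be made explicit with simple arithmetic: the number of residues per helical turn is 360° divided by the twist, and the pitch is that number multiplied by the axial rise. The sketch below uses the averages from Table 1 as read here; the assignment of values to columns is an assumption based on the table layout, and the function names are illustrative.

```python
# Averages taken from Table 1: (axial rise in angstroms, helical twist in degrees).
# The column assignment is an assumption based on the table layout.
HELICES = {
    "YR*Y triplex (X-ray)": (3.3, 31.0),
    "YR*Y triplex (NMR)": (3.4, 31.0),
    "YR*R triplex (NMR)": (3.6, 30.0),
    "B-DNA": (3.4, 36.0),
    "A-DNA": (2.6, 33.0),
}


def residues_per_turn(twist_deg: float) -> float:
    """Number of base pairs (or base triads) in one full helical turn."""
    return 360.0 / twist_deg


def pitch(rise_angstrom: float, twist_deg: float) -> float:
    """Helix pitch in angstroms: axial rise times residues per turn."""
    return rise_angstrom * residues_per_turn(twist_deg)


for name, (rise, twist) in HELICES.items():
    print(f"{name}: {residues_per_turn(twist):.1f} per turn, "
          f"pitch {pitch(rise, twist):.1f} A")
```

With these numbers a YR*Y triplex has between 11 and 12 triads per turn, compared with 10 basepairs per turn for B-DNA, which is the sense in which the duplex within the triplex is unwound.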

NMR studies of the YR*R triplexes (33, 59) showed that their overall 
structure is similar to that of YR*Y triplexes. The important difference, how- 
ever, is that reverse Hoogsteen basepairing of the third strand (as in Figure 3) 
was convincingly demonstrated. The helical parameters presented in Table 1 
are close to those of YR*Y triplexes and suggest the formation of an unwound
B-like structure. A peculiar feature of the YR*R triplexes consisting of CG*G
and TA*T triads is the concerted changes in the axial rise and helical twist
along the helix axis (60). This was attributed to the lack of isomorphism
between CG*G and TA*T triads, discussed above.

H Form

Although the canonical Watson-Crick double helix is the most stable DNA 
conformation for an arbitrary sequence under usual conditions, some sequences 
within duplex DNA are capable of adopting structures quite different from the 
canonical B form under negative superhelical stress (reviewed in 61). One
of these structures, the H form, includes a triplex as its major structural element 
(Figure 4). Actually, there is an entire family of H-like structures (see 20 for 
a comprehensive review). 

The term "H form" was proposed in a study of a cloned sequence from a
spacer between the histone genes of sea urchin (62). It contained a d(GA)14
stretch hypersensitive to S1 nuclease. Such S1-hypersensitive sites had been
anticipated previously to adopt an unusual structure (63, 64), and numerous 
hypotheses had been discussed in the literature (63, 65-71). Using 2-D gel 
electrophoresis of DNA topoisomers (see 20), a structural transition was dem- 
onstrated without enzymatic or chemical modification (62). The pH depen- 
dency of the transition was remarkable: At acidic pH the transition occurs 
under low torsional tension, while at neutral pH it is almost undetectable. 
Because pH dependence had never been observed before for non-B-DNA 
conformations (cruciforms, Z-DNA, bent DNA, etc.), the investigators
concluded that a novel DNA conformation was formed. The structure was called
the H form because it was clearly stabilized by hydrogen ions, i.e. it was a 
protonated structure. 

The H form model proposed in Ref. 8 (Figure 4A) consists of an
intramolecular triple helix formed by the pyrimidine strand and half of the
purine strand, leaving the other half of the purine strand single stranded. As
Figure 4A shows, this structure is topologically equivalent to unwound DNA. Two
isoforms of H form are possible: one single stranded in the 5' part of the purine
strand and the other single stranded in the 3' part. The existence of
single-stranded purine stretches in H-DNA explains its hyperreactivity to S1
nuclease. Canonical TA*T and CG*C+ base triads stabilize the triple helix
(Figure 2). The protonation of cytosines is crucial for the formation of CG*C+
base triads, which explains the pH dependency of the structural transition.

The H-DNA model predicts that a homopurine-homopyrimidine sequence 
must be a mirror repeat to form H-DNA. It was convincingly demonstrated in 
Ref. 9 that mirror repeats indeed adopt the H form, while even single-base 
violations of the mirror symmetry significantly destabilize the structure. Chem- 
ical probing of H-DNA using conformation-specific DNA probes (reviewed 
in 72) provided final proof of the H-form model (25, 48, 49, 73-76). Notably, 
these studies revealed that different sequences preferentially adopt only one 
of the two possible isomeric forms of H-DNA, the one in which the 5' part of 
the purine strand is unstructured. 
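The sequence requirement stated above (a homopurine-homopyrimidine tract that is also a mirror repeat) lends itself to a simple check. The following sketch is only an illustration under stated assumptions: it examines one strand written 5' to 3', treats the whole string as the candidate tract, and uses a hypothetical `max_mismatches` parameter to express tolerance for violations of the mirror symmetry (zero by default, since even single-base violations are reported to destabilize the structure).

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_homopurine_or_homopyrimidine(strand: str) -> bool:
    """True if the strand contains only purines or only pyrimidines."""
    bases = set(strand.upper())
    return bool(bases) and (bases <= PURINES or bases <= PYRIMIDINES)


def mirror_mismatches(strand: str) -> int:
    """Count position pairs that violate mirror symmetry.

    A mirror repeat reads the same in both directions along ONE strand
    (unlike an inverted repeat, which involves the complementary strand).
    """
    s = strand.upper()
    half = len(s) // 2
    return sum(1 for i in range(half) if s[i] != s[-1 - i])


def is_h_dna_candidate(strand: str, max_mismatches: int = 0) -> bool:
    """Homopurine or homopyrimidine tract with (near-)perfect mirror symmetry."""
    return (is_homopurine_or_homopyrimidine(strand)
            and mirror_mismatches(strand) <= max_mismatches)


print(is_h_dna_candidate("AGGAGAAGAGGA"))  # True: perfect homopurine mirror repeat
print(is_h_dna_candidate("AGGAGAAGAGAA"))  # False: one symmetry violation
print(is_h_dna_candidate("AGCTTCGA"))      # False: mirror repeat, but mixed bases
```

Because the check applies to the exact string passed in, a tract such as (GA)n has to be trimmed by one base to its mirror-symmetric core before it passes; scanning a longer sequence for embedded tracts is beyond this sketch.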
Figure 4 H-DNA menagerie. A. H-DNA model. Bold line, homopurine strand; thin line,
homopyrimidine strand; dashed line, the half of the homopyrimidine strand donated to the triplex.
B. Two isoforms of *H-DNA. C. Nodule DNA. D. Tethered loop. In B-D, solid line, homopurine
strand; stippled line, homopyrimidine strand.

The structural features responsible for the difference between the two
isoforms have been identified in Ref. 77. The isoform with the 3' half of the 
pyrimidine strand donated to the triplex (designated H-y3) is preferentially 
formed at physiological superhelical densities. In this isoform, the 5' portion 
of the purine strand is single stranded, and its formation is consistent with the 
chemical probing results described above. The other isoform (in which the 5' 
half of the pyrimidine strand is donated to the triplex — designated H-y5) was 
only observed at low superhelical density. Topological modeling of H-DNA 
formation showed that the formation of the H-y3 isoform releases one extra 
supercoil relative to the H-y5 isoform. This explains why H-y3 is favorable at 
high superhelical density. Recent studies show that the mechanisms underlying 
preferential isomerization into the H-y3 conformation are more complex. Ap- 
parently, the presence of bivalent cations can make the H-y5 isoform preferable 
(78). What is more surprising, the loop sequence plays an important role in 
determining the direction of isomerization (79, 80). Systematic studies of 
factors contributing to isomerization are yet to be done. 

H-DNA Menagerie 

As for intermolecular triplexes, a menagerie of H-DNA-like structures exists 
(reviewed in 20). First, an intramolecular YR*R triplex, called *H-DNA, was
described in Refs. 25 and 81 (Figure 4B). This structure is also topologically
equivalent to the unwound DNA and requires DNA supercoiling (82). As in
intermolecular YR*R triplexes, A can be replaced with T (83) and, at acidic
pH, G can be replaced with A (28) in the third strand of *H-DNA. Thus, the
sequences adopting the *H form are not necessarily mirror repeated and not 
even necessarily homopurine-homopyrimidine (see 20 for comprehensive re- 
view). 

Two isoforms of *H-DNA are possible, designated H-r3 and H-r5 according 
to which half of the homopurine strand is donated to the triplex (Figure 4B).
Chemical probing with single-stranded DNA-specific agents showed that the H-r3
isoform is dominant.

As for all YR*R triplexes, the mechanisms of *H-DNA dependence on
bivalent cations are unclear. Cation requirements are different for different
sequences (20, 25, 27, 81, 84-87). For example, while *H-DNA formed by
d(G)n·d(C)n sequences is stabilized by Ca2+, Mg2+, and Mn2+, the same
structure formed by d(GA)n·d(TC)n is formed in the presence of Zn2+, Mn2+,
Cd2+, and Co2+. The differences in cation requirements are due to variations in
neighboring triads or changes in the GC content or both. Even moderate
changes in GC content (from 75% to 63%) switched cation requirement from
Mg2+ to Zn2+ for a particular sequence (22). A Mg2+-to-Zn2+ switch was
reported to affect the equilibrium between H-r5 and H-r3 isoforms (86) or to
substantially modify the *H-structure (87).

A hybrid of H and *H forms was described, called nodule DNA (88, 89)
(Figure 4C). Nodule DNA is an analog of the intermolecular alternate-strand
triplexes described above.

A peculiar H-like structure formed by two distant homopurine-homopyrimidine
tracts was described in Ref. 90. It is in a way similar to an early model
for S1 hypersensitivity in the human thyroglobulin gene (69). It was found
that linear DNA containing both tracts at pH 4.0 and in the presence of
spermidine migrates very slowly in an agarose gel. This abnormal
electrophoretic mobility was attributed to the formation of a so-called
tethered loop (Figure 4D). In this structure, the homopyrimidine strand of one
stretch forms a triplex with a distant stretch, while its complementary
homopurine strand remains single stranded. Supporting this model, it was found
that the addition of excess homologous homopyrimidine, but not homopurine,
single-stranded DNA prevented loop formation. Though the mechanism of tethered
loop formation is not self-evident, it is allowed topologically. Chemical
probing is required to prove the existence of this structure definitively.

Specificity of Triplex Formation 

The specificity and stringency of triplex formation (35) has attracted serious 
attention for two reasons. First, the formation of triplexes is limited to the 
homopurine-homopyrimidine sequences or to sequences composed of adjacent
oligopurine/oligopyrimidine clusters. This major limitation to the biological
and therapeutic applications of triple-helical DNAs prompted an extensive
search for DNA bases that could be incorporated into the third strand of a 
triplex in order to recognize thymines or cytosines in the otherwise homopurine 
strand of the duplex. Secondly, accurate knowledge of the specificity of third- 
strand recognition for perfect homopurine-homopyrimidine sequences is nec- 
essary in order to target natural DNAs. 

The quest for such knowledge stimulated the study of non-orthodox triads. 
So far most of the data have been collected for YR*Y triplexes, including all 
14 noncanonical triads (other than CG*C and TA*T). One approach was to 
analyze the influence of mismatched triads on H-DNA formation using 2-D 
gel electrophoresis (91). Stability of mismatched triads in intermolecular tri- 
plexes was studied using affinity cleavage (92), melting experiments (93, 94), 
and NMR (95). These studies agreed that although single mismatches could 
be somewhat tolerated, each mismatch significantly disfavored triplex formation. The
mismatch energies were within the range of 3-6 kcal/mol, i.e. similar to the 
cost of B-DNA mismatches. Thus, homopyrimidine oligonucleotides form 
triplexes with target sequences at a specificity comparable to that seen in
Watson-Crick complementary recognition. 
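To give a feeling for what a mismatch cost of 3-6 kcal/mol means for binding, the free-energy penalty can be converted into a fold change in the equilibrium binding constant through the Boltzmann factor exp(ΔΔG/RT). The sketch below is simple arithmetic, not a result from the cited studies; the temperature of 298.15 K is an assumption, since the text does not state one, and the function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def affinity_penalty(ddg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Fold decrease in the equilibrium binding constant for a free-energy cost.

    Uses K_matched / K_mismatched = exp(ddG / (R*T)).
    The default temperature (25 degrees C) is an assumption for illustration.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


for ddg in (3.0, 6.0):
    print(f"{ddg:.1f} kcal/mol -> about {affinity_penalty(ddg):.0f}-fold weaker binding")
```

Under this assumption a single mismatch weakens binding by roughly two to four orders of magnitude, which is consistent with the statement that triplex recognition is about as stringent as Watson-Crick recognition.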

High sequence specificity of third-strand recognition of
homopurine-homopyrimidine sequences in the duplex makes TFOs very attractive
candidates for targeting genomic DNA. Supporting this conclusion,
homopyrimidine TFOs equipped with Fe·EDTA have been demonstrated to cleave
unique sites in
yeast (96, 97) and human (98) chromosomes. They were also found to be 
convenient tools for affinity capture of human genomic targets (99). 

However, studies widely disagreed on the relative stability of individual 
noncanonical triads. For example, the AT*G triplet was shown to be the most 
favorable in studies of intermolecular triplexes (92, 93), but it is not among 
the best for H-DNA (91). This contradiction could be due to the different 
triplex-forming sequences studied by different groups, since heterogeneity in 
stacking interactions within a triple helix must seriously affect its stability. 
This idea was recently supported by NMR studies of the AT*G triad (58, 100). 
It was found that guanine in this triplet is tilted out of the plane of its target 
AT basepair to avoid a steric clash with the thymine methyl group. This causes
a favorable stacking interaction between this guanine and the thymine flanking 
it from the 5'-side, which is likely to be a major determinant of AT*G triplet 
stability. This also explains the differences between the inter- and
intramolecular triplex studies: In the first case, guanine was flanked by
thymine on the
5' end (92), while in the second case, it was flanked by a cytosine (91). Thus, 
the favorable stacking interaction was absent in the intramolecular triplex, and 
the AT*G triad was relatively unstable. Recently, it was shown directly that 
replacement of the TA*T triad on the 5' side of guanine with a CG*C triad 
reduces the stability of the TA*G triplet (101). The clear message from these
results is that the influence of nearest neighbors on triad stability must be 
studied to better understand the duplex-to-triplex transition. This doughty goal 
is not yet achieved. 

Notwithstanding the difficulties discussed above, empirical rules for target- 
ing imperfect homopurine-homopyrimidine sequences were suggested in Ref. 
102. If the homopurine strand of a duplex is interrupted by a thymine or 
cytosine, it must be matched by a guanine or thymine, respectively, in the third 
strand. However, this expansion of the third-strand recognition code is premature,
as was recently addressed (103). The GC*T triad, though reasonably
stable, is dramatically weaker than the canonical TA*T triad. Thus, a TFO 
containing a thymine, intended to interact with a cytosine in the target, would 
bind significantly better to a different target containing adenine in the corre- 
sponding position. In the AT*G case, the triad specificity is high, but the 
affinity of the G for the TA pair is only modest. 
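
The recognition rules discussed above can be summarized in a short sketch. The Python snippet below is an illustration only, not a procedure taken from the cited studies. It maps each base of the purine-rich target strand to a third-strand base for a YR*Y triplex, using the canonical triads (A read by T, G read by protonated C) and the tentative extensions of Ref. 102 (an interrupting T read by G, an interrupting C read by T). The function name `design_tfo`, the two lookup tables, and the convention that the third strand is written parallel to the purine-rich strand are assumptions made for this example.

```python
# Third-strand base chosen for each base of the purine-rich target strand
# in a YR*Y triplex. Keys are target-strand bases, values are TFO bases.
CANONICAL = {"A": "T", "G": "C"}   # TA*T and CG*C+ triads
EXTENDED = {"T": "G", "C": "T"}    # AT*G and GC*T triads (Ref. 102), weaker


def design_tfo(purine_strand: str) -> tuple[str, list[int]]:
    """Return (tfo, weak_positions) for a 5'->3' purine-rich target strand.

    The TFO is written 5'->3', parallel to the target strand (an assumption
    of this sketch). weak_positions lists the 0-based indices at which a
    noncanonical triad had to be used.
    """
    tfo = []
    weak = []
    for i, base in enumerate(purine_strand.upper()):
        if base in CANONICAL:
            tfo.append(CANONICAL[base])
        elif base in EXTENDED:
            tfo.append(EXTENDED[base])
            weak.append(i)  # position where the caveats of Ref. 103 apply
        else:
            raise ValueError(f"unexpected base {base!r} at position {i}")
    return "".join(tfo), weak


tfo, weak = design_tfo("AAGGTAAG")
print(tfo, weak)
```

Every index reported in `weak_positions` marks a triad for which the text's warning holds: a T placed opposite a C interruption would bind an A at that position better, and a G placed opposite a T interruption is specific but binds only modestly.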

Another approach to overcoming the homopurine-homopyrimidine target 
requirements is to incorporate artificial DNA bases within the third strand. 
Several studies found that non-natural bases, such as 2'-deoxynebularine or 
2'-deoxyformycin A and others, may form very stable triads with cytosines 
and thymines intervening the homopurine strand (94, 104, 105). It is yet to be 
seen whether the specificity and stringency of such complexes are sufficient.

Limited data are available on the mismatched triads in YR*R triplexes. By
use of affinity cleavage experiments, all 13 noncanonical triplets (all combi- 
nations except CG*G, TA*A, and TA*T) were shown to disfavor triplex 
formation (106). The only notable exception is the CG*A triad, which is 
favorable under acidic pH due to the protonation of its adenine (28). Much as 
with homopyrimidine TFOs, purine-rich TFOs can be used specifically to
target homopurine-homopyrimidine sequences in natural DNAs.

Stabilization of Triplexes 

The stabilization of DNA triplexes is particularly important for any possible 
biological applications. As discussed above, the YR*Y triplexes are formed 
under acidic pH, while YR*R triplexes require millimolar concentrations of 
bivalent cations. Physiological pH, however, is neutral, and a high concentra- 
tion of unbound bivalent cations in a cell is unlikely. Thus, numerous studies 
have been aimed at the stabilization of DNA triplexes at physiological condi- 
tions. 

Most of the YR*Y triplex studies have concentrated on overcoming
pH dependency. The most promising results show that polyamines, specifically 
spermine and spermidine, favor both inter- and intramolecular YR*Y triplexes 
under physiological pH (11, 107, 108). The stabilizing effect is likely due to 
decreased repulsion between the phosphate backbones after binding to polya- 
mines, overcoming the relatively high density of a negative charge in triplexes. 
The millimolar polyamine concentrations found in the nuclei of eukaryotic 
cells (reviewed in 109) raise the hope for triplexes in vivo. 

The requirement for cytosine protonation could be overcome by several 
chemical means. The incorporation of 5-methylcytosines instead of cytosines 
in TFOs increases the stability of YR*Y triplexes at physiological pH (110, 
111), but a more detailed study found that this effect is relatively small (the
apparent methylation-induced ΔpKa is only 0.5) (112). Another solution is to
substitute cytosines in the third strand with non-natural bases that do not require 
protonation for Hoogsteen hydrogen bond formation. Indeed, the substitution
of cytosines with N6-methyl-8-oxo-2'-deoxyadenosines (113), pseudoisocytidines
(114), 7,8-dihydro-8-oxoadenines (115), or 3-methyl-5-amino-1H-pyrazolo[4,3-d]pyrimidin-7-ones
(116) led to pH-independent triplex formation.
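
To see why a ΔpKa of 0.5 counts as a small effect, the protonated fraction of a third-strand cytosine can be estimated from the Henderson-Hasselbalch relation. The sketch below is a back-of-the-envelope illustration: the apparent pKa of 6.0 is a placeholder chosen for the example, not a value taken from the cited work, and the function name is hypothetical.

```python
def protonated_fraction(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch fraction of a base present in protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


ASSUMED_PKA = 6.0   # placeholder apparent pKa of cytosine in the third strand
DELTA_PKA = 0.5     # methylation-induced shift quoted in the text (112)

before = protonated_fraction(7.0, ASSUMED_PKA)
after = protonated_fraction(7.0, ASSUMED_PKA + DELTA_PKA)
print(f"protonated at pH 7: {before:.3f} -> {after:.3f}")
```

Under this assumption the protonated fraction at pH 7 rises from roughly 0.09 to roughly 0.24, so most third-strand cytosines would still be unprotonated. This is consistent with the interest in bases that need no protonation at all.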

Intermolecular triplexes could be additionally stabilized if the third strand 
represented an oligodeoxynucleotide-intercalator conjugate. This was first 
demonstrated for a homopyrimidine oligonucleotide linked with an acridine 
derivative (10) and later shown for other oligonucleotides and intercalators 
(117-119). The stabilization is due to the intercalation of a ligand into DNA 
at the duplex-triplex junction. For reasons that are yet unclear, the most stable 
complex is formed when the intercalator is attached to the 5' end of the TFO.
Particularly promising for gene targeting is an oligonucleotide-psoralen conjugate,
as near-UV irradiation of a triplex formed by such a conjugate leads
to crosslink formation, making the triplex irreversible (120).

An independent line of research has sought triplex-specific ligands. One
such ligand, a derivative of benzo[e]pyridoindole (BePI), has been described 
in Refs. 121 and 122. BePI shows preferential intercalation into a triple- rather 
than double-helical DNA, thus greatly stabilizing triplexes (122). Another 
promising triplex-binding ligand is coralyne (123). 

It should be emphasized that, to be prospective drugs for gene targeting, 
TFOs must meet two requirements: They must bind their targets relatively 
strongly and not target other sequences. If a TFO has very strong affinity to 
its target, it can also bind to a site with one or even more mismatches. This 
should be especially true for non-sequence-specific stabilization of triplexes 
with intercalating drugs attached to TFOs. Therefore, increased stability inev- 
itably entails decreased selectivity of the TFO. It is not at all accidental that 
the spectacular demonstration of sequence-selective cutting of genomic DNA 
with TFOs was achieved under conditions of extremely weak binding of the 
TFO to its target site (96-98). Systematic experimental study of sequence 
selectivity of all modified TFOs mentioned above is still lacking. However, it 
is obvious that these modified TFOs should exhibit poorer selectivity than do 
the original TFOs. 
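
The trade-off between affinity and selectivity can be made concrete with a deliberately simple equilibrium model. The sketch below assumes 1:1 binding at a fixed free TFO concentration and a fixed free-energy penalty for a single mismatch. The function names and all numerical values (1 µM TFO, a 3 kcal/mol penalty, the two dissociation constants) are hypothetical choices for illustration and are not data from the cited studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def occupancy(tfo_conc: float, kd: float) -> float:
    """Fraction of sites bound for simple 1:1 equilibrium binding."""
    return tfo_conc / (tfo_conc + kd)


def selectivity(tfo_conc: float, kd_target: float,
                mismatch_penalty_kcal: float, temp_k: float = 310.0) -> float:
    """Ratio of target-site occupancy to mismatched-site occupancy."""
    # A mismatch weakens binding by a constant factor exp(ddG / RT).
    kd_mismatch = kd_target * math.exp(mismatch_penalty_kcal / (R_KCAL * temp_k))
    return occupancy(tfo_conc, kd_target) / occupancy(tfo_conc, kd_mismatch)


weak = selectivity(1e-6, 1e-5, 3.0)    # weakly binding TFO
strong = selectivity(1e-6, 1e-9, 3.0)  # same TFO after strong stabilization
print(f"weak binder: {weak:.1f}-fold, strong binder: {strong:.2f}-fold")
```

In this model the weakly binding TFO discriminates the mismatched site by more than 100-fold, whereas the strongly stabilized one saturates both sites and discriminates hardly at all, which mirrors the qualitative argument in the text.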

The stabilization of intramolecular triplexes could be achieved in several 
ways, the most obvious of which is to increase the negative superhelical 
density, since the formation of H-DNA releases torsional stress. As is discussed 
below, the increase of negative supercoiling does provoke triplex formation 
in vivo. The polyamine stabilization of H-DNA at physiological pH has already 
been mentioned. 

A less obvious way of stabilizing H-DNA, called kinetic trapping, was 
described (124). It was found that oligonucleotides complementary to the 
single-stranded homopurine stretch in H-DNA stabilized H-DNA under neutral 
pH, where H-DNA alone rapidly reverts to the B conformation. 

Peptide Nucleic Acid (PNA) 

PNA is the prototype of an entire new class of TFO-based drugs that interact 
with DNA in a manner unlike that of ordinary TFOs. PNA (Figure 5A) was 
designed in the hope that such an oligonucleotide analog containing normal 
DNA bases with a polyamide (i.e. proteinlike) uncharged backbone would
form triplexes with double-stranded DNA (dsDNA) much more efficiently 
than do the regular TFOs (14). 

Instead of forming triplexes with duplex DNA, the first studied homothymine
PNA oligomer, PNA-T10, opened the DNA duplex at A/T tracts, forming
an exceptionally strong complex with the A strand and displacing the T strand
(15, 125, 126). At the same time, model experiments with complexes formed
between PNA oligomers and oligonucleotides revealed that, while PNA/DNA
heteroduplexes are not much more stable under ordinary conditions than are
DNA/DNA homoduplexes (127), two homopyrimidine PNA oligomer molecules
form exceptionally stable triplexes with the complementary homopurine
oligonucleotide (128, 129).

Figure 5 A. The chemical structures of PNA and DNA. B. P-loop formation. Bold line, DNA;
stippled line, PNA.

These results strongly suggest an unusual mode of binding between the 
synthetic analog and dsDNA. Namely, two homopyrimidine PNA molecules
displace the duplex DNA pyrimidine strand and form a triplex with the purine 
strand of DNA (15, 16, 130, 131). These complexes are called the P-loops 
(Figure 5B). 

The P-loop is a radically different complex from that formed between duplex
DNA and ordinary TFOs. Although (PNA)2/DNA triplex formation
during the strand-displacement reaction has been convincingly proven (16,
130, 131), the mechanism of P-loop formation remains to be elucidated. The
available data indicate that the reaction most probably proceeds via a short-lived
intermediate, which consists of one PNA molecule complexed with the
complementary DNA strand via Watson-Crick pairing. This intermediate is
formed due to thermal fluctuations (breathing) of the DNA duplex (132, 133).
It is very unstable and would dissociate if it were not fixed by the second PNA
oligomer in a (PNA)2/DNA triplex, leading to P-loop formation (see Figure
5B). This triplex is remarkably stable.

PNA forms much more stable complexes with dsDNA than do regular
oligonucleotides. This makes PNA very promising as an agent for sequence-specific
cutting of duplex DNA (16), for use in electron-microscopy mapping
of dsDNA (15), and as a potential antigene drug (134, 135), as PNA is
remarkably stable in biological fluids in which normal peptides and oligonucleotides
are quickly degraded (136).

However, serious limitations for various applications of PNA still remain.
P-loop formation proceeds through a significant kinetic barrier and strongly
depends on ionic conditions (15, 16, 123, 126). This dependency, if not bypassed,
poses significant limitations on possible sequence-specific targeting of
dsDNA by PNA under physiological conditions. Although the stringency of
(PNA)2/DNA triplexes is not yet known, PNA should still target predominantly
homopurine-homopyrimidine regions, just as do regular TFOs.

BIOCHEMISTRY OF TRIPLEXES 

Formation and Possible Functions of H-DNA In Vivo 

As is true for other unusual DNA structures, such as cruciforms, Z-DNA, and 
quadruplexes, the biological role of H-DNA is yet to be established. Two
important problems must be addressed: (a) Can H-DNA be formed in cells in
principle? (b) In which biological process, if any, is H-DNA involved? Recently
it became clear that the answer to the first question is yes. There are currently 
many hypotheses on the role of H-DNA in DNA replication, transcription, and 
recombination, but more studies are needed to answer the second question. 

Sequences that can form H-DNA are widespread throughout the eukaryotic 
genomes (137, 138) but are uncommon among eubacteria. However, direct 
detection of H-DNA in eukaryotic cells is very difficult because of the com- 
plexity of genomic DNA. Therefore, most of the studies on the detection of 
H-DNA in vivo exploited Escherichia coli cells bearing recombinant plasmids
with triplex-forming inserts as convenient model systems. Chemical probing 
of intracellular DNA proved helpful for the detection of H-DNA in vivo. 
Certain chemicals, such as osmium tetroxide, chloroacetaldehyde, and psora- 
len, give a characteristic pattern of H- or *H-DNA modification in vitro. 
Conveniently, they can also penetrate living cells. Thus, the general strategy
for detecting H-DNA in vivo was to treat E. coli cells with those chemicals, 
isolate plasmid DNA, and locate modified DNA bases at a sequence level. 
The coincidence of modification patterns in vitro and in vivo basically proved 
the formation of the unusual structure in the cell. 

Using this approach, the formation of both H- and *H-DNA was directly
shown (139-141). The corresponding studies were reviewed in Ref. 20, but 
we briefly summarize the major findings. All these studies agreed that the level 
of DNA supercoiling in vivo is the major limiting factor in the formation of 
these structures. Though transient formation of H-DNA was observed in normal
exponentially growing E. coli cells (141), formation of H-DNA was much
more pronounced when intracellular DNA supercoiling increased, due to mutations
in the gene for Topo I (141) or due to treatment of cells with chloramphenicol
(139, 140). Environmental conditions during E. coli growth also
significantly contributed to the appearance of triplexes. H-DNA formation was
greatly enhanced when cells were growing in mildly acidic media, which
somewhat decreased intracellular pH (139, 141), while *H-DNA was observed
in cells growing in media with a high concentration of Mg2+ ions (140). Neither
result is surprising, because H-DNA is stabilized by protonation while *H-DNA
is stabilized by bivalent cations.

Besides the steady-state level of DNA supercoiling, determined by the
balance of DNA gyrase and Topo I (reviewed in 142), the local level of
supercoiling strongly depends on transcription. During the process of polymerization,
the RNA polymerase creates domains of high negative and positive
supercoiling upstream and downstream of it, respectively (143), which may
influence the formation of unusual DNA structures (144, 145). Chemical
probing of intracellular DNA demonstrated transcriptionally driven formation
of *H-DNA within long d(G)n·d(C)n stretches located upstream of a regulated
promoter in an E. coli plasmid (146). Remarkably, the formation of *H-DNA
stimulated homologous recombination between direct repeats flanking the
structure. Thus, this work shows the formation of *H-DNA under completely
physiological conditions in a cell, and implicates it in the process of recombination.

The only data on triplex DNA detection in eukaryotic cells were obtained 
using antibodies against triple-helical DNA (147). These antibodies were found 
to interact with eukaryotic chromosomes (148, 149). 

Many ideas have been proposed involving H-DNA in such basic genetic 
processes as replication and transcription. The hypothesis regarding H-DNA 
in replication is based on the observation that triplex structures prevent DNA 
synthesis in vitro. On supercoiled templates containing *H-DNA, DNA syn- 
thesis prematurely terminates. The location of the termination site is different 
for different isoforms of *H-DNA, but it always coincides with the triplex 
boundaries as defined by chemical probing (83). 

More peculiarly, H-like structures can be formed in the process of DNA
polymerization and efficiently block it. Two such mechanisms were demonstrated
experimentally (Figure 6A,B). It was found that d(GA)n or d(CT)n
inserts within single-stranded DNA templates cause partial termination of
DNA polymerases at the center of the insert (21, 150). It was suggested that
when the newly synthesized DNA chain reaches the center of the homopolymer
sequence, the remaining homopolymer stretch folds back, forming a stable
triplex (Figure 6A). As a result, the DNA polymerase finds itself in a trap and
is unable to continue elongation.

In open circular DNA templates, H-like structures are absent due to the lack
of DNA supercoiling. It was shown, however, that T7 DNA polymerase terminated
exactly at the center of *H-forming sequences. This was observed
when the pyrimidine-rich but not the purine-rich strand served as a template
(22). To explain this, one must remember that DNA synthesis on double-stranded
templates is possible due to the ability of many DNA polymerases to
displace the nontemplate DNA strand (reviewed in 151). The displaced strand
may fold back, promoting the formation of an intramolecular triplex downstream
of the replication fork at an appropriate sequence. Conditions for DNA
synthesis in vitro, i.e. neutral pH and high magnesium concentration, are
optimal for the formation of YR*R triplexes. Thus, the displacement of the
purine-rich (but not the pyrimidine-rich) strand provokes triplex formation
which, in turn, leads to termination of DNA synthesis (Figure 6B).

Figure 6 DNA polymerase-driven triplex formation blocks polymerization. Black boxes, the two
halves of a homopurine-homopyrimidine mirror repeat involved in the formation of an
intramolecular triplex; striated arrow, the newly synthesized DNA chain. A. Single-stranded DNA
template. B. Double-stranded DNA template.

There are only fragmentary data on the role of H motifs in the regulation
of replication in vivo. Several homopurine-homopyrimidine inserts were
shown to decrease the efficiency of Simian virus 40 (SV40) DNA replication
(152, 153). Quite recently, the pausing of the replication fork in vivo within
a d(GA)n·d(TC)n insert in SV40 DNA was demonstrated directly using a
technique called two-dimensional neutral/neutral gel electrophoresis (23).
Though these data make the idea of H-DNA involvement in the regulation of
replication promising, it is far from proven. Future studies are crucial for the
evaluation of this hypothesis.

Numerous studies concerned the possible role of H-DNA in transcription.
Deletion analysis of various promoters showed that homopurine-homopyrimidine
stretches are essential for promoter functioning; these promoters include
Drosophila hsp26 (154, 155); mouse c-Ki-ras (156) and TGF-β3 (157); human
EGFR (158), ets-2 (159), IR (160), and c-myc (161, 162); and others.

These sequences serve as targets for nuclear proteins, presumably transcriptional
activators. Several homopurine-homopyrimidine DNA-binding proteins
were described, including BPG1 (163), NSEP-1 (164), MAZ (165), nm23-H2
(166), PYBP (167), Pur-1 (168), etc. Peculiarly, these proteins often bind
preferentially to just one strand of the H motifs. For example, a number of
mammalian proteins specifically recognize homopurine-homopyrimidine sequences
in the double-helical state as well as the corresponding homopyrimidine
single strands (164, 167, 169, 170). This unusual binding pattern may
dramatically influence the equilibrium between different DNA conformations
in the promoter in vivo.

However, the importance of the H structure for transcription was questioned 
in several studies. One approach is to analyze the influence of point mutations 
within H motifs that destroy or restore H-forming potential on the promoter's 
activity. No such correlation was observed for Drosophila hsp26 (155) and 
mouse c-Ki-ras (171) promoters. The situation with the c-myc promoter is
more complex, since it is unclear if the canonical H-DNA or some other 
structure is formed even in vitro (172). Mutational analysis of the promoter 
gave contradictory results, with one group claiming the existence (173) and 
another the lack (174) of a correlation between structural potential and pro- 
moter strength. Another approach to detecting H-DNA in eukaryotic promoters 
is direct chemical probing followed by genomic sequencing. So far, this has 
only been done for the Drosophila hsp26 gene, and H-DNA was not observed 
(155). 

It is hard to completely rule out the role of H-DNA in transcription based 
on the above results. First, it is quite possible that the structural peculiarities 
of promoter DNA segments may affect the interaction between promoter DNA 
and specific regulator proteins. The features of homopurine-homopyrimidine 
DNA-binding proteins described above as well as a report about the partial 
purification of a triplex-binding protein (175) indirectly support this idea. A
study in which the influence of d(G)n stretches of varying length on the activity
of a downstream minimal promoter was analyzed additionally supports this
hypothesis (176). A clear inverse correlation between the ability of a stretch
to form the *H configuration in vitro and its ability to activate transcription
in vivo was observed. It was concluded, therefore, that short d(G)n stretches
serve as binding sites for a transcriptional activator, while longer stretches
adopt a triplex configuration, which prevents activator binding. Secondly,
negative data on the role of H-DNA in transcription were obtained in transient
assays, while it can actually work at a chromosome level. Indeed, the H motif in
the Drosophila hsp26 gene was found to affect the chromatin structure (177,
178).

Despite the wealth of data and hypotheses, there is no direct evidence that 
the structural features of H motifs are involved in transcriptional regulation in 
vivo, and further studies are required to address this issue.

Targeting Basic Genetic Processes Using TFOs

Highly sequence-specific recognition of double-helical DNAs by TFOs is the 
basis of an antigene strategy (reviewed in 13). The idea is that binding of a 
TFO to a target gene could prevent its normal functioning. Most studies of 
this strategy concerned the inhibition of transcription; the studies were inspired 
in part by the existence of functionally important homopurine-homopyrimidine
stretches in many eukaryotic promoters (see the previous section), which are 
appropriate targets for TFOs. The antigene strategy could potentially lead to 
rational drug design. Very convincing data on the inhibitory effects of TFOs 
were obtained in various in vitro systems. There are also preliminary indica- 
tions that TFOs may function in vivo as well. 

The first stage that is affected by TFOs is the formation of an active promoter
complex. Pioneering results were obtained for the human c-myc promoter,
where it was found that the binding of a purine-rich TFO to the imperfect
homopurine-homopyrimidine sequence 125 basepairs (bp) upstream of the P1
promoter start site blocks its transcription in vitro (179). The TFO's target is
important for c-myc transcription, serving as a binding site for a protein(s),
presumably a transcriptional activator (161, 162). At least two candidate genes
coding for proteins that bind to this target have been cloned and sequenced
(164, 166). Similar observations were made for the metallothionein gene
promoter. In this case a homopyrimidine oligonucleotide formed a triplex with
the upstream portion of the promoter, preventing the binding of the transcriptional
activator Sp1 (111). This in turn drastically reduced the promoter's
activity in a cell-free transcription system (179a). TFOs were also shown to
prevent Sp1 binding to the human DHFR (180) and H-ras (181) promoters.
Finally, a triplex-forming oligonucleotide-intercalator conjugate was shown to
act as a transcriptional repressor of the interleukin-2 receptor α gene in vitro
(182), preventing the binding of the transcriptional activator NF-κB. In all these
cases TFOs efficiently blocked the access of the transcription factors to their
binding sites.

TFOs also inhibit initiation of transcription by RNA polymerases. The 
pBR322 bla gene contains a 13-bp homopurine-homopyrimidine target just
downstream of the transcriptional start site. A 13-mer homopyrimidine oligo- 
nucleotide forming an intermolecular triplex with this target hindered initiation 
of transcription by E. coli RNA polymerase in vitro (183). Independent studies 
showed that this is also the case for T7 RNA polymerase (184). 

Finally, eukaryotic RNA polymerase II transcription was followed in vitro
from the adenovirus major late promoter (185). The transcribed portion of
DNA contained a 15-bp homopurine-homopyrimidine tract that formed an
intermolecular triplex with the homopyrimidine TFO. When added prior to
RNA polymerase, the TFO truncated a significant portion of the transcripts.

Thus, TFOs can block transcription at different stages: promoter complex
formation, initiation, and elongation. This appears to be true for both pro- and
eukaryotic RNA polymerases. TFOs can be considered to be artificial repressors
of transcription (186).

There is a growing number of indications that TFOs may act as repressors
of transcription in cell cultures as well. The most convincing results so far
were obtained for the interleukin-2 receptor α promoter (182, 187). Homopyrimidine
TFOs were designed to overlap a target site and prevent binding
of the transcriptional activator NF-κB. They were conjugated with acridine to
stabilize the triplexes, or psoralen to make triplex formation irreversible after
UV irradiation. The plasmid bearing the reporter gene under the control of the
IL-2Rα promoter was cotransfected with these TFOs in tissue cultures, where
it was shown that TFOs block promoter activity in vivo. Particularly strong
inhibition was observed after UV irradiation of cells transfected with psoralen
conjugates. In the latter case, chemical probing directly demonstrated the
formation of intermolecular triplex in vivo. A similar cotransfection approach
was also used to target Interferon Responsive Elements in vivo (188).

A different approach was used in several studies where purine-rich TFOs
were added to the growth media of cells containing target genes. To prevent
oligonucleotides from degrading, their 3' ends were protected by an amino
group (189). Such oligonucleotides accumulated within cells and could be
recovered in intact form. Partial transcriptional inhibition of human c-myc and
IL-2Rα genes by such TFOs has been reported (189, 190). Similar effects were
observed for human immunodeficiency virus (HIV) transcriptional inhibition
in chronically infected cell lines (191). Using cholesterol-substituted TFOs,
the progesterone-responsive gene has also been inhibited (192). Though the
inhibitory effect was never more than 50%, it is quite remarkable considering
that a short oligonucleotide must find its target in an entire genome and prevent
its proper interaction with cellular transcriptional machinery. Note, however,
that in none of those cases was the formation of triplexes directly demonstrated.
Other mechanisms of oligonucleotide-caused transcriptional inhibition must
be ruled out in the future.

The use of TFOs for DNA replication inhibition is less studied. In vitro
formation of putative intramolecular triplexes or H-like triplexes (see Figure
1) on single-stranded DNA templates traps many different DNA polymerases
(22, 193). Purine-rich TFOs are particularly efficient even against such processive
enzymes as T7 DNA polymerase and thermophilic Taq and Vent
polymerases, because the conditions of DNA synthesis in vitro are favorable
for YR*R triplexes. Pyrimidine-rich TFOs must be additionally crosslinked to
the target to cause inhibition (194). TFOs also block DNA polymerases on
double-stranded templates (195). The inhibition of DNA synthesis in vitro was
observed not only when triplexes blocked the path of DNA polymerase, but
also when a polymerization primer was involved in triplex formation (193).
Single-stranded DNA-binding protein (SSB protein) helped DNA polymerases
partially overcome the triplex barrier, but with an efficiency dramatically
dependent on the triplex configuration.

Though these observations make TFOs promising candidates for trapping
DNA replication in vivo, there are almost no experimental data regarding this.
The only published data concern the use of an octathymidylate-acridine conjugate,
which binds to a d(A)8 stretch in SV40 DNA adjacent to the T antigen-binding
site. In vivo it partially inhibits SV40 DNA replication, presumably
by interfering with the DNA binding or with unwinding activities of the
T antigen (196).

The major problem with the use of TFOs is in matching high sequence 
selectivity with binding that is sufficiently strong to interfere with genetic 
processes. Under physiological conditions, TFOs bind weakly to their targets, 
which by itself favors a high sequence selectivity. However, to significantly 
affect genetic processes, the TFO must be rather long, which limits the number
of potential targets, as such long homopurine-homopyrimidine stretches are
infrequent.
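
How restrictive the length requirement is can be checked directly on a sequence of interest. The sketch below is an illustration only: it scans one strand of a duplex for uninterrupted homopurine tracts of a chosen minimum length on either strand. The function name and the default threshold of 15 bp are assumptions for this example, not values prescribed by the cited studies.

```python
import re


def tfo_target_sites(seq: str, min_len: int = 15):
    """Return (start, end, strand) for homopurine tracts of >= min_len bp.

    strand is '+' when the purines lie on the given strand and '-' when
    they lie on the complementary strand (seen here as a pyrimidine run).
    Coordinates are 0-based and end-exclusive on the given strand.
    """
    seq = seq.upper()
    sites = []
    for pattern, strand in ((r"[AG]+", "+"), (r"[CT]+", "-")):
        for m in re.finditer(pattern, seq):
            if m.end() - m.start() >= min_len:
                sites.append((m.start(), m.end(), strand))
    return sorted(sites)


print(tfo_target_sites("ACGT" + "AGGAAGGAGAAGGAGA" + "TCA"))
```

Raising `min_len` shows how quickly the list of candidate sites shrinks, which is the practical form of the limitation described above. Tracts with a single interruption are not reported by this scan.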

Three-Stranded DNA Complexes in Homologous 
Recombination 

In this section we briefly discuss a still poorly understood three-stranded DNA
complex, formed by RecA protein and, possibly, recombination proteins from
other sources. RecA protein is well known to exhibit many enzymatic activities
essential for recombination (reviewed in 18, 197). The main function of RecA
protein in recombination is to exchange a single-stranded DNA (ssDNA) strand
with its homolog in dsDNA. The sequential stages of this reaction are: (a)
cooperative assembly of RecA protein molecules on the ssDNA, leading to
the formation of a right-handed helical nucleoprotein filament called the presynaptic
complex, (b) synapsis, i.e. the formation of a complex between this filament
and the homologous dsDNA, and (c) the actual strand exchange, which requires
ATP hydrolysis. Strand exchange proceeds in only one direction: The displacement
of a linear single-stranded product starts from its 5' end.

The synapsis step requires searching for homology between the presynaptic
filament and the target dsDNA. One way to do so is to use Watson-Crick
complementarity rules. However, this requires a partial strand separation of
the dsDNA, resulting in the formation of a so-called D-loop. In this structure,
one of the DNA strands of the duplex is displaced, while the other is involved
in Watson-Crick pairing with incoming ssDNA. An alternative, very attractive
possibility, first postulated in Refs. 17 and 198, does not require dsDNA
strand separation and invokes triplex formation. This hypothetical type of
DNA triplex was later called "recombination," "parallel," or R-DNA (19,
199, 200). These names emphasize two fundamental differences between this
hypothetical triplex and the well-characterized orthodox DNA triplexes described
in other sections of this chapter. First, chemically homologous DNA
strands are parallel in R-DNA but antiparallel in standard triplexes. Secondly,
any sequence can adopt an R-DNA conformation, while homopurine-homopyrimidine
stretches are strongly preferable in adopting standard triplex
structure.
In important experiments on strand exchange between the partially homol- 
ogous substrates (201, 202), three types of joint molecules were observed. In 
the case of proximal joints, the area of homology is situated at the 5' end of 
the outgoing duplex strand, i.e. both synapsis and strand exchange are possible. 
For a distal joint (with homology at the 3' end of the outgoing strand), RecA 
cannot drive strand exchange. Medial joints contain heterologous regions at
both ends of the dsDNA, making strand exchange from any DNA end impossible.
Since synaptic complexes were detected in all three cases, it became clear
that synapsis and strand exchange are not necessarily coupled. When synaptic
complexes, in particular the medial complexes, were treated with DNA
crosslinking agents, crosslinks were observed between all three DNA strands
involved in the complex (203), indicating a close physical proximity of the
three strands.

Analysis of distal joints with very short (38-56-bp) regions of homology 
showed that they are remarkably stable upon the removal of RecA protein 
(199). In fact, joint molecules dissociated at temperatures indistinguishable 
from the melting temperatures of DNA duplexes of the same length and 
sequence. In spite of its stability, however, the complex did not form sponta- 
neously without recombination proteins. The conclusion was that RecA and 
related proteins promote the formation of a novel "recombinant" DNA triplex, 
which otherwise cannot form, presumably due to a kinetic barrier of unknown 
nature. Independent studies confirmed the extreme stability of deproteinized 
distal joints with longer regions of homology (204). The basepairing scheme 
for R-DNA involving triplets for arbitrary DNA sequences was suggested in 
Refs. 19 and 205. The unique feature of these triplets is the interaction of the 
third strand with both bases of the Watson-Crick pair. 

Although the above data seem to be most consistent with the idea of a 
"recombination" triplex formation, a careful analysis of three-stranded com- 
plexes formed under RecA protein (206) using chemical probing indicates that 
basepairing in the parental duplex is disrupted. The incoming ssDNA appears 
to form W-C pairs with the complementary strand of the duplex. It was 
concluded that the synapsis is accompanied by local unwinding, leading to the 
formation of D-loop-like structures, rather than the "recombination" triplexes 
(206). This conclusion was supported by the finding that the N7 position of 
guanines, which is involved in Hoogsteen hydrogen bonding in all known 
triplexes in vitro (see Figure 2), is not required for the formation of 
three-stranded complexes by RecA protein (207). 

Thus, the putative triplex between the incoming single strand and the duplex 
systematically avoids detection. Nevertheless, a more general question remains 
whether the "recombination" triplexes can be formed in principle, even if they 
do not play any role in recombination. This kind of triplex has recently been 
claimed for a postsynaptic complex formed between the outgoing single strand 
and the duplex yielded as a result of the strand exchange (208). 

Quite recently it was suggested that a specifically designed oligonucleotide 
could fold back to form an intramolecular R-like structure without the assis- 
tance of any proteins (209). The main argument is that the thermal denaturation 
curves are biphasic, which was interpreted as successive triplex-to-duplex and 
duplex-to-single-strand transitions. This is hardly a sufficient argument, and 
data on the chemical and enzymatic probing of such complexes provided in 
the same study do not support the claim. 

In the absence of conclusive evidence, the existence of "recombination" 
triplexes, or R-DNA, remains doubtful. One of the most uncomfortable questions 
is the extreme thermal stability of deproteinized distal joints described 
in Refs. 199 and 204. None of the proposed models can satisfactorily explain 
this feature. It is totally unclear what is the nature of the kinetic barrier that 
prevents the formation of R-DNA by dsDNA and homologous oligonucleotide 
without any protein. It is also unclear why the medial joints, unlike the distal 
joints, are unstable upon deproteinization (203, 210). An additional concern is 
possible exonuclease contamination of the RecA protein and SSB protein 
preparations used for strand transfer reaction. At least in one case, such con- 
tamination was admitted to be responsible for the formation of distal junctions 
(211). In both original papers (199, 204), the authors claimed the lack of 
nuclease contamination. As shown in (211), however, exonuclease I (Exo I) is 
enormously activated by SSB protein. As a result, the levels of Exo I required 
to generate the reverse strand exchange are extremely low (1 molecule of Exo 
I per 20,000 molecules of RecA protein). In the light of these new findings, 
it seems possible that distal joints, which were as stable as duplex DNA, might 
actually be duplexes formed after SSB-activated trace contamination of Exo I 
digested the nonhomologous strand from its 3' end. 

Even in the absence of a clear understanding of the structure of three- 
stranded joints promoted by RecA protein, they have already found interesting 
applications in gene targeting. The first example is called RARE, for RecA- 
Assisted Restriction Endonuclease cleavage (210). The rationale for this ap- 
proach is that since RecA protein can form three-stranded complexes between 
dsDNA and oligonucleotides as short as 15 nucleotides (212), such complexes 
can be used to block specific methylation sites in dsDNA. After the removal 
of proteins and consequent dissociation of the three-stranded complexes, cleavage 
by methylase-sensitive restriction endonuclease is limited to the targeted 
site. Thus, one can cleave large DNAs at a unique site or, using pairs of 
oligonucleotides, separate specific DNA fragments from the genome. 
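
The logic of RARE can be sketched as a toy calculation: every recognition site is methylated except those shielded by a RecA-oligonucleotide complex, so only the shielded sites remain cleavable once the proteins are removed. The sketch below is illustrative only. The choice of the EcoRI site GAATTC, the function names, and the representation of a protected region as a (start, end) span are assumptions made here, not details taken from Ref. 210.

```python
import re

SITE = "GAATTC"  # EcoRI recognition site, chosen here purely for illustration


def find_sites(dna: str, site: str = SITE) -> list[int]:
    """Return the start index of every occurrence of the recognition site."""
    # A lookahead is used so that overlapping occurrences would also be found.
    return [m.start() for m in re.finditer(f"(?={site})", dna)]


def rare_cleavage_sites(dna: str,
                        protected_spans: list[tuple[int, int]],
                        site: str = SITE) -> list[int]:
    """Sites that stay unmethylated, and hence cleavable, after RARE.

    protected_spans holds half-open (start, end) intervals covered by
    hypothetical RecA-oligonucleotide complexes during methylation.
    """
    cleavable = []
    for pos in find_sites(dna, site):
        site_end = pos + len(site)
        # Any overlap with a protected span shields the site from the methylase.
        if any(start < site_end and pos < end for start, end in protected_spans):
            cleavable.append(pos)
    return cleavable


# Two sites in a made-up sequence; one oligonucleotide targets the first.
dna = "ACGT" * 3 + "GAATTC" + "TTTT" * 3 + "GAATTC" + "ACGT" * 2
single_cut = rare_cleavage_sites(dna, [(8, 23)])
# A pair of oligonucleotides protects both sites, excising the fragment between them.
double_cut = rare_cleavage_sites(dna, [(8, 23), (26, 40)])
```

With one protected span the cut is unique; with a pair of spans the two cuts bracket a fragment, which corresponds to the two uses named in the paragraph above.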

A Nucleic Acid Triple Helix Formed by a 
Peptide Nucleic Acid-DNA Complex 

Laurie Betts, John A. Josey,* James M. Veal, Steven R. Jordan 

The crystal structure of a nucleic acid triplex reveals a helix, designated P-form, that differs 
from previously reported nucleic acid structures. The triplex consists of one polypurine 
DNA strand complexed to a polypyrimidine hairpin peptide nucleic acid (PNA) and was 
successfully designed to promote Watson-Crick and Hoogsteen base pairing. The P-form 
helix is underwound, with a base tilt similar to B-form DNA. The bases are displaced from 
the helix axis even more than in A-form DNA. Hydrogen bonds between the DNA backbone 
and the Hoogsteen PNA backbone explain the observation that polypyrimidine PNA 
sequences form highly stable 2:1 PNA-DNA complexes. This structure expands the 
number of known stable helical forms that nucleic acids can adopt. 

Oligonucleotides and modified derivatives, such as phosphorothioates, are being 
examined as antisense therapeutic agents. An ideal antisense agent would be 
nuclease-resistant and uncharged to aid cellular penetration. A dramatic deviation 
from the phosphoribose backbone is PNA, originally developed to be an antigene 
triplexing agent (1). PNA is uncharged and stable toward nucleases, and the achiral 
PNA backbone (Fig. 1A) can be synthesized by amide-based chemistry. 

The first report on PNA demonstrated that a T10 PNA polymer bound its complementary 
A10 DNA sequence with a markedly increased Tm (temperature at which 50% of 
double-stranded DNA is melted) relative to a T10/A10 DNA duplex. T10 PNA displaced 
the polythymidylate [poly(T)] portion of a DNA duplex target (1, 2) at low ionic 
strength. Further study showed a tendency for poly(T) PNA to form 2:1 PNA-DNA 
complexes (3). Several groups have designed and synthesized bis- or hairpin PNAs 
to promote triplex formation by tethering two polypyrimidine PNA strands by 
flexible linkers (4, 5). These PNAs did indeed have increased affinity for 
single-stranded DNA and a higher rate of strand invasion for double-stranded DNA, 
relative to single-stranded PNA. To gain insight into the manner in which PNAs 
complex with DNA to form triplexes, we determined the structure of a nine-base 
hairpin PNA-DNA complex (Fig. 1B). The structure provides a basis for understanding 
the high affinity toward nucleic acids exhibited by polypyrimidine PNA and reveals 
an unusual helix. 

Fig. 1. PNA monomer and 2:1 PNA-DNA complex structures. (A) Structure of a PNA 
monomer. Backbone torsion angles are indicated by Greek letters according to 
convention (11). Carbon positions are designated by A, D, G, E, and F; nitrogen 
position by B; and oxygen position by H. (B) Diagram of the PNA-DNA triplex. DNA 
was either synthesized by us or purchased from Research Genetics and used without 
further purification. PNAs were synthesized as described (4). The 5-iodo-U base 
provided phase information and was a convenient reference point in the electron 
density. The hexapeptide linker, His-Gly-Ser-Ser-Gly-His, consists of all (L) 
amino acids. 



Table 1. Structure determination. Hairpin PNA-DNA complexes were prepared for 
crystallization by annealing. Complex at 0.3 mM was mixed with an equal volume of 
1.4 M ammonium sulfate and 0.1 M tris-HCl (pH 8.5) and equilibrated at 22°C or 4°C 
(hanging drop or dialysis buttons) against a reservoir at 1.4 M ammonium sulfate 
and 0.1 M tris. Crystals were space group P6[...]22, a = b = 73.38, c = 141.28 Å. 
Native data were collected at -170°C at the CHESS A-1 beamline to 2.8 Å, with a 
crystal equilibrated in mother liquor including 25% glycerol. Data were reduced 
and scaled with DENZO (16). 5-Iodo-U derivative crystal data were obtained to a 
resolution of 2.5 Å at room temperature on a rotating anode with R-axis II imaging 
plates and processed with R-axis software. The two iodine atoms in the asymmetric 
unit were located by difference Patterson maps and confirmed by anomalous 
difference Pattersons. Initial SIRAS phases to 4.0 Å were calculated and refined 
with the PHASES program (17). Solvent flattening was done to 4.0 Å until 
convergence. Heavy atom parameters were refined against solvent-flattened phases, 
iterating the process in increasing shells of resolution to 2.8 Å. A starting 
model was then built into the electron density with the O graphics program (18). 
The crude model consisting of the bases and backbone provided a mask for 
noncrystallographic symmetry averaging and phase extension to 2.5 Å. The model was 
rebuilt into the 2.5 Å map and had a crystallographic R factor of 45%. X-PLOR 
simulated annealing refinement (19) with AMBER-based (20) parameters with the 
iodo-U crystal data gave a model with an R factor of 22% from 6.0 to 2.5 Å. 
Thirty-five solvent molecules were identified in Fo - Fc maps by using a 2.8σ 
cutoff. The final model after individual B factor refinement gave an R factor of 
18.7%. FOM, figure of merit; rms, root mean square. 

Data collection 

Parameter                 Native      5-Iodo-U PNA 
Resolution (Å)            2.8         2.5 
Total observations        52,3[?]4    29,543 
Unique reflections        5,723       8,374 
Completeness (%)          95          93 
Rsym*                     4.1         4.8 

SIRAS statistics 

Statistic                 Isomorphous   Anomalous 
Phasing power             1.41          3.31 
Rcullis† (centric)        0.580 
Reflections used (n)      5,355         4,149‡ 
Overall FOM               0.607 

Refinement 

Resolution (Å)                            6.0-2.5 
R factor (Rfree)§ (%)                     18.7 (23.2) 
Average I/σ to 2.5 Å                      6.2 
Reflections with |F| > 2σ                 6,857 
Total number of atoms                     1,941 
Water molecules (n)                       35 
rms deviation of bond lengths (Å)         0.014 
rms deviation of bond angles (degrees)    2.43 

*Rsym is the agreement between all observations of symmetry-related reflections. 
†Rcullis (centric) = Σ||FPH ± FP| - |FH(calc)|| / Σ|FPH ± FP|, where FPH and FP 
are the observed derivative and native structure amplitudes, respectively, and 
FH(calc) is the calculated heavy atom structure factor. ‡Number of reflections 
with average anomalous difference >2σ; anomalous difference = |I+ - I-|, where I 
is intensity. §Rfree is the cross-validation R factor computed for the test set of 
reflections (8% of the total), which are omitted in the refinement process. 

Fig. 2. Representative C-G-C base triplet and its corresponding electron density. 
The map was calculated with the 2.5 Å symmetry-averaged, solvent-flattened SIRAS 
phases and is contoured at 1σ. 

Table 2. Average torsion angles and helical parameters of PNA-DNA triplex compared with canonical 
A-DNA and B-DNA. The average angle for each torsion was calculated over both triplexes in the 
asymmetric unit. Helical parameters for the DNA portion of the triplex were determined with programs 
described in (21). 

Torsion angles (degrees) 

Molecule             α      β     γ     δ      ε      ζ      χ    χ1     χ2    χ3 
DNA in triplex     -70    173    61    77   -161    -69   -167 
A-DNA              -50    172    41    79   -146    -78   -154 
B-DNA*             -46   -147    36   157    155    -96    -98 
PNA in WC strand  -103     73    70    93    165                   1   -170    89 
PNA in H strand   -108     69    69    87    175                   1   -175   102 

Helical parameters 

Molecule          Twist       Rise   Base tilt   Displacement   Bases 
                  (degrees)   (Å)    (degrees)   (Å)            per turn 
A-DNA             32.7        2.6    20.0         4.5           11 
B-DNA             36.0        3.4    -5.9        -0.1           10 
DNA in triplex    22.9        3.4     5.1         6.8           16 

The structure was solved by using a PNA with an iodinated base to provide initial 
phase information (Table 1). The SIRAS phases were refined by solvent flattening 
and noncrystallographic symmetry averaging of the two triplexes in the asymmetric 
unit. The resulting 2.5 Å electron density map (Fig. 2) clearly shows the 
positions of the [...] atoms and all of the [...] backbone. 

The structure revealed a helix (Fig. 3A), which we refer to as P-form for PNA, 
with helical parameters different from either A-form or B-form DNA (Table 2). 
The P-form helix has a large cavity along the helix axis, reflected by an average 
base displacement of 6.8 Å (Fig. 3B), compared with 4.5 Å for A-form DNA. The 
deoxyribose sugars all have a C3'-endo conformation with an average 
interphosphate distance of 6.0 Å, similar to A-form DNA. This conformation is 
consistent with the fact that PNAs, including hairpins, bind more tightly to RNA 
than DNA (5). The tilt of the base triplets, however, is more similar to that of 
B-form DNA, where the bases are nearly perpendicular to the helix axis. The 
relative orientation of the Watson-Crick (WC) strands is such that the 
NH2-terminus of the PNA is aligned with the 3' end of the DNA strand. This strand 
orientation preference is consistent with studies of mixed base sequences and 
hairpin polypyrimidine PNAs. The Hoogsteen (H) strand of the PNA is antiparallel 
to the WC PNA strand, as designed. 
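
The "bases per turn" column of Table 2 follows directly from the helical twist, and the helical pitch follows from the rise. The short sketch below reproduces that arithmetic from the tabulated twist and rise values; the function names and the dictionary layout are illustrative choices made here, and the pitch is a derived quantity that is not listed in the table.

```python
def bases_per_turn(twist_deg: float) -> float:
    """Residues needed to complete one 360-degree turn of the helix."""
    return 360.0 / twist_deg


def pitch(twist_deg: float, rise_angstrom: float) -> float:
    """Axial length of one full turn, in angstroms."""
    return rise_angstrom * bases_per_turn(twist_deg)


# (twist in degrees, rise in angstroms), copied from Table 2
helices = {
    "A-DNA": (32.7, 2.6),
    "B-DNA": (36.0, 3.4),
    "DNA in triplex": (22.9, 3.4),
}

for name, (twist, rise) in helices.items():
    print(f"{name}: {bases_per_turn(twist):.1f} bases per turn, "
          f"pitch {pitch(twist, rise):.1f} A")
```

The low twist of 22.9 degrees is what makes the P-form helix underwound: it needs roughly 16 residues per turn, compared with 10 or 11 for the canonical duplex forms.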

Within the two triplexes in the asymmetric unit all 10 T-A-T triplets and 7 of 
the 8 C-G-C triplets form with the expected hydrogen-bonding geometry and 
distances. The purine base of each triplet forms a Watson-Crick base pair with 
the pyrimidine base on one PNA strand and a Hoogsteen base pair with the 
pyrimidine on the other strand. This type of (Y·R-Y) interaction has been seen in 
nuclear magnetic resonance (NMR) studies of oligonucleotide triplexes (7) and 
predicted for 2:1 PNA-DNA complexes (3-6). The average distance between the N3 
atoms of the Hoogsteen cytosines and N7 atoms of the guanines is 2.8 Å, 
indicating that those atoms must be hydrogen bonded (Fig. 1). The N3 of the 
Hoogsteen cytosines must be significantly protonated even though the pKa (where 
Ka is the acidity constant) of a free cytosine is 4.2 and this triplex was 
crystallized at pH 8.5. The one cytosine that does not make the intended 
Hoogsteen interaction swings out to form an edge-to-face interaction with a 
cytosine of a crystallographically related complex. 
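
To see why protonation at pH 8.5 is notable, the Henderson-Hasselbalch relation can be applied to a free cytosine with the pKa of 4.2 quoted above. The sketch below is a back-of-the-envelope estimate for the free base in solution only; it says nothing quantitative about the pKa of cytosine within the triplex, and the function name is an illustrative choice.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a monoprotic base in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Free cytosine (pKa 4.2) at the crystallization pH of 8.5
free_cytosine = protonated_fraction(ph=8.5, pka=4.2)

# Upward pKa shift that would be needed for half of the N3 sites
# to be protonated at pH 8.5
shift_for_half = 8.5 - 4.2

print(f"protonated fraction of free cytosine at pH 8.5: {free_cytosine:.1e}")
print(f"pKa shift needed for 50% protonation: {shift_for_half:.1f} units")
```

Under these assumptions only about 5 in 100,000 free cytosines would carry the N3 proton at pH 8.5, so the observed hydrogen bonding implies that the triplex environment raises the effective pKa substantially.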

The pattern of base stacking along the Watson-Crick portion of the P-form 
triplex resembles that found in an A-form DNA duplex (8), despite the much larger 
displacement of the bases from the P-form helix axis. The 2-keto groups of the 
Hoogsteen pyrimidines stack over the imidazole ring of the preceding DNA purine. 
These interactions are consistent with those in [...] strand. The O1P oxygen from 
each phosphate group of the DNA backbone hydrogen bonds to the amide proton of 
each residue of the PNA backbone of the H strand with an average distance [...]. 
There are extensive van der Waals interactions between the two strands. A total 
[...] H PNA residue is buried on complex formation with a DNA residue. In 
contrast, the WC PNA backbone makes no direct interactions with the DNA. 

Fig. 3. [Caption only partly legible.] DNA in both triplexes is [...] PNA 
strands. Hydrogen bonds between the H PNA and DNA [...] strand of the [...]. 
Linker amino acids are [...], with each model representing slightly more than 
one helical turn. 

Fig. 4. Minor groove interactions in an A-T region. Solvent molecules are black 
spheres. Hydrogen bonds are drawn as dashed lines. The DNA backbone is filled in 
gray, and the PNA is white. A water molecule links each WC PNA backbone amide 
proton to the O2 keto oxygen of the preceding pyrimidine residue on that strand. 
A second row of waters bridges the first row with the ribose O4' atom and adenine 
N3 atoms. As a consequence of this hydration network, the WC PNA and DNA 
backbones are linked together. 

SCIENCE • VOL. 270 • 15 DECEMBER 1995 

There are two triplexes stacked in the asymmetric unit, related by [...] 
twofold symmetry perpendicular to the helix axis (Fig. 3A). As a result, the two 
PNA backbones [...] helix with the carboxylate of the COOH-terminus of each of 
the H PNA chains forming a salt bridge with the NH2-terminus of each WC PNA 
chain. The amino acid linkers sterically [...] stacking between the top of each 
triplex and any other triplex, preventing formation of an infinite helix. 
The linker peptide was [...] in the asymmetric unit adopts a different [...] 
within sterically allowed [...]. The average main-chain temperature factor for 
the amino acid linker region is [...] of the PNA backbone, further indicating 
[...]. 

There are [...] clearly defined water molecules. One class of molecules binds in 
the minor groove to both the amide [...], with an average distance of [...] Å, 
and the O2 oxygen of the preceding pyrimidine [...], with an average distance of 
[...] Å (Fig. 4). These water molecules [...] residues in the WC PNA strand in a 
conformation that is nearly identical to the H PNA strand. They were the 
strongest peaks in the Fo - Fc difference maps (where Fo and Fc are the observed 
and calculated structure factors, respectively) and were present in the original 
electron density. Another set of water molecules [...] the first set [...] the N3 
atoms of the [...] bases in the minor groove (Fig. 4). In the case of C-G-C 
triplets, [...] corresponding waters [...] because of steric interference by the 
exocyclic [...]. [...] waters all interact to stabilize the P-form helix. [...] 
solvent [...] have been observed to stabilize B-form DNA in A-T rich regions by 
the so-called spine of hydration (10). In the P-form [...] groove, exocyclic 
heteroatoms of the [...] pyrimidines are generally solvated by either one or two 
water [...]. 
The structure of the complex explains a number of biochemical and biophysical 
[...] of PNAs. The [...] direct confirmation that PNAs recognize nucleic acids by 
both Watson-Crick and Hoogsteen hydrogen bonding. The conformation of the PNA and 
DNA backbones that [...] and triplex interactions is elucidated by the structure 
(Table 2). The tight association between the DNA strand and the H PNA strand is 
stabilized both by hydrogen bonds between the DNA phosphate oxygens and the PNA 
amides and by extensive van der Waals interactions that lock the structure in the 
P-form helical conformation. Hydrogen bonds to solvent molecules in the [...] 
the P-form Watson-Crick PNA-DNA interactions. This stable triplex accounts for 
the strong tendency of polypyrimidine PNA sequences to associate with polypurine 
DNA in a 2:1 ratio and also accounts for the increased Tm of these complexes 
compared with DNA duplex and triplex sequences. 

The NMR structure of a mixed sequence PNA-RNA duplex has been recently reported 
(11) and was described to be most consistent with an A-form helix. The PNA χ1 
angle is very [...] in both the NMR duplex and x-ray triplex structures, so that 
the carbonyl of the tertiary amide points toward the COOH-terminus of the PNA 
strand. The duplex and triplex structures differ at the α and ε torsion angles, 
which specify [...] of the interresidue amide carbonyl. In the refined NMR duplex 
structures the predominant orientation of the backbone carbonyl oxygen is more 
toward the PNA NH2-terminus, whereas in the triplex the carbonyl oxygen points 
toward the PNA COOH-terminus. In the triplex crystal structure the conformation 
of the PNA residues is nearly identical in the WC strand and in the H strand, 
suggesting that [...] conformation consistent [...] WC duplex or a triplex. The 
observed differences in backbone conformation between the NMR duplex and the 
x-ray triplex could arise from factors such as differences in sequence length 
and composition, solution compared with crystal, and an RNA compared with a DNA 
target. 

DNA triplex structures have been studied predominantly by NMR. In some studies 
the triplex is described as being more [...] A-form (7), whereas [...] studies 
indicate it is more similar to B-form (12), as reported for the original fiber 
diffraction studies of [...]. Base triplets have been observed in a number of 
crystal structures, but the extent of the "helix" is limited to one or two base 
triplets, usually resulting from an interaction between an overhanging base with 
a duplex strand (14). Clearly triplex formation may require features that are 
[...] accommodated by either classical A- or B-helices. The 2:1 PNA-DNA triplex 
forms a previously unknown helix that expands the library of known nucleic acid 
helices [...] DNA triplexes yet to be discovered. 